feat(babel)!: add parallel processing via worker threads#1956
feat(babel)!: add parallel processing via worker threads#1956davidtaylorhq wants to merge 2 commits intorollup:masterfrom
Conversation
- drops support for rollup 1.x, which allows simplifying the implementation - implements 'plugin hook filters' for improved performance under rolldown - passes a new `code` argument to custom filter functions, so that consumers can add more advanced filtering logic - allows custom filter functions to be async
Add a `parallel` option that processes files concurrently using Node.js worker threads. This reduces build times for large projects where Babel transformation is a bottleneck. This is similar to the existing parallel behavior of `@rollup/plugin-terser`. This required some fairly significant refactoring, because we can only pass serializable objects between the main thread and the worker threads. It also required changes to the plugin's own build config, so that we can generate a dedicated worker entrypoint. Validations are added to ensure that unserializable config (e.g. inline babel plugins) cannot be used alongside the new parallel mode. For people using dedicated babel config files, this isn't a problem, because they are loaded directly by babel in the worker thread itself. The worker threads do have a setup cost, so this only makes sense for large projects. In Discourse, enabling this parallel mode cuts our overall vite (rolldown) build time by about 45% (from ~11s to ~6s) on my machine.
|
@Andarist @shellscape sorry for the ping, but just wanted to check if this is something you'd consider looking at? The performance benefits are pretty massive for large projects using Babel. |
|
I'm really looking forward to this landing! I have a local tarball that I've been copying into my projects to have the benefits of this PR - and it's really good. I can no longer imagine using babel in rollup/vite/rolldown without the parallel mode <3 (usage without parallel mode is painfully slow) |
|
|
||
| const parallelWorkerCount = typeof parallel === 'number' ? parallel : 4; | ||
| workerPool = new WorkerPool( | ||
| new URL('./worker.js', import.meta.url).pathname, |
There was a problem hiding this comment.
is this compatible with repo-wide supported node.js version?
There was a problem hiding this comment.
Worker Threads and import.meta.url are both supported in Node 14 👍
There was a problem hiding this comment.
have you considered using some existing worker pool libraries?
There was a problem hiding this comment.
I aimed to match the implementation of the existing Terser plugin, since I wasn't sure if adding new dependencies would be welcome. I'm certainly happy to explore it though! Did you have any particular libraries in mind?
workerpool looks like it could be a good bet 👀
There was a problem hiding this comment.
I guess it's fine to keep it as-is if it matches the one in the terser plugin. It would be nice to extract this to a shared package but I don't know what's the proper drill for this. It probably could/should be extracted once this PR lands.
|
@CharlieHelps please do a thorough review of this PR to determine if it's ready to merge. Look at the diff, reviews, comments, and referenced issues/docs/links to build a complete understanding. Then respond with a comment that includes a list of MUST fix items that must be fixed before the the PR is merged, and a list of IMPROVEMENTS that shouldn't block the merge but may be quick improvements or things that could be addressed in a follow up PR. The MUST fix items should explain exactly what's wrong and what needs to change to resolve the problem. Both should be numbered lists and the IMPROVEMENTS list should continue counting up from the MUST list. It's OK if there are no IMPROVEMENTS or MUST fix items, you can just omit the lists. |
|
@davidtaylorhq looks like Github's buggy UX bit us on this and I can't see/approve the CI runs. please push a new commit to the branch to trigger that UX again. |
|
@shellscape you can close and re-open a PR to trigger that, usually -- no reason to make a contributor do a silly because GH is buggy <3 |
|
I pulled the branch and reviewed the diff + existing review comments. Local verification for MUST fix
IMPROVEMENTS
|
| footer(chunkInfo) { | ||
| if (chunkInfo.name === 'index') { | ||
| return 'module.exports = Object.assign(exports.default, exports);'; | ||
| } | ||
| return null; |
There was a problem hiding this comment.
why do we need this now? I don't see any change in the index.js that would affect the package's shape
There was a problem hiding this comment.
I had to refactor rollup.config.mjs to add the worker entrypoint. So this code is re-implementing this functionality from the shared config which was being used before:
plugins/shared/rollup.config.mjs
Line 28 in 7d16103
There was a problem hiding this comment.
q: have you tested the plugin works now correctly in both CJS and ESM modes?
| "@babel/core": "^7.0.0", | ||
| "@types/babel__core": "^7.1.9", | ||
| "rollup": "^1.20.0||^2.0.0||^3.0.0||^4.0.0" | ||
| "rollup": "^2.0.0||^3.0.0||^4.0.0" |
There was a problem hiding this comment.
@shellscape this required a major bump of this package - we should ensure that the automation will be able to bump this properly. I think that's achieved using some special tags in commit messages
There was a problem hiding this comment.
yeah including BREAKING CHANGES in the commit body or changing the PR title to reflect a major version in the conventional commit spec.
| * @default undefined; | ||
| */ | ||
| filter?: ReturnType<CreateFilter>; | ||
| filter?: (id: string, code: string) => Promise<boolean>; |
There was a problem hiding this comment.
This PR is based on #1954, which should ideally be merged first. Would you prefer that I drop the other one? (they both touch the same code, so it would require some unpicking to separate them)
| let transformOptions = !overrides.config | ||
| let transformOptions = !overrides?.config | ||
| ? config.options | ||
| : await overrides.config.call(ctx, config, { |
There was a problem hiding this comment.
this would technically be a breaking change, right? we should call it out
There was a problem hiding this comment.
Ah, sorry - the function already comes prebound now. So there is no change here really. Those overrides can't be handled here in the parallel mode, right?
There was a problem hiding this comment.
It's not a breaking change, because the function is now bound to ctx earlier:
There was a problem hiding this comment.
Those overrides can't be handled here in the parallel mode, right?
Correct, and an error will be emitted if parallel: true is set alongside these options
| transformOptions = | ||
| babelHelpers === BUNDLED | ||
| ? addBabelPlugin(transformOptions, bundledHelpersPlugin) | ||
| : transformOptions; |
There was a problem hiding this comment.
Hm, so this now relies on babelHelpers input to skip the addBabelPlugin call in the renderChunk's case, right?
| }, | ||
| customOptions, | ||
| error: this.error.bind(this), | ||
| runPreflightCheck: !skipPreflightCheck, |
There was a problem hiding this comment.
IMHO, this conversion make the code slightly harder to follow. We should standardize on a single variable name for this
| return workerPool.runTask({ | ||
| inputCode: code, | ||
| babelOptions: { ...babelOptions, filename }, | ||
| runPreflightCheck: !skipPreflightCheck, | ||
| babelHelpers | ||
| }); |
There was a problem hiding this comment.
q: I'm not sure what we actually get from this.error but maybe it's worth rechecking if we shouldn't "convert" the errors from this to this.error calls?
| if (parallel) { | ||
| const parallelAllowed = | ||
| isSerializable(babelOptions) && !overrides?.config && !overrides?.result; |
There was a problem hiding this comment.
perhaps it would be nice to move this validation to unpackInputPluginOptions? It would also be quite great if the thrown error message could mention the offending option.
| } | ||
|
|
||
| return transformCode(code, babelOptions, overrides, customOptions, this); | ||
| return transformCode({ |
There was a problem hiding this comment.
q: why we don't attempt to parallelize this one?
Rollup Plugin Name:
@rollup/plugin-babelThis PR contains:
Are tests included?
Breaking Changes?
If yes, then include "BREAKING CHANGES:" in the first commit message body, followed by a description of what is breaking.
List any relevant issue numbers: n/a
Description
feat(babel): add parallel processing via worker threads
Add a
paralleloption that processes files concurrently using Node.js worker threads. This reduces build times for large projects where Babel transformation is a bottleneck. This is similar to the existing parallel behavior of@rollup/plugin-terser.This required some fairly significant refactoring, because we can only pass serializable objects between the main thread and the worker threads. It also required changes to the plugin's own build config, so that we can generate a dedicated worker entrypoint.
Validations are added to ensure that unserializable config (e.g. inline babel plugins) cannot be used alongside the new parallel mode. For people using dedicated babel config files, this isn't a problem, because they are loaded directly by babel in the worker thread itself.
The worker threads do have a setup cost, so this only makes sense for large projects. In Discourse, enabling this parallel mode cuts our overall vite (rolldown) build time by about 45% (from ~11s to ~6s) on my machine.
This PR is based on #1954, which should ideally be merged first. Happy to rebase this one on
mainif the other PR isn't mergable.For testing, I've pushed a built version of this branch to https://github.com/davidtaylorhq/plugins/tree/babel-parallel-built. In pnpm, it can be installed like: